arXiv:1508.07190v1 [quant-ph] 28 Aug 2015


Reducing multi-qubit interactions in adiabatic quantum computation. Part 2: The 
"split-reduc" method and its application to quantum determination of Ramsey numbers 

Emile Okada*

Department of Mathematics, Cambridge University, CB2 3AP, Cambridge, UK.

Richard Tanburn†

Mathematical Institute, Oxford University, OX2 6GG, Oxford, UK.

Nikesh S. Dattani‡

School of Materials Science and Engineering, Nanyang Technological University, 639798, Singapore, and
Fukui Institute for Fundamental Chemistry, 606-8103, Kyoto, Japan

Quantum annealing has recently been used to determine the Ramsey numbers $R(m,2)$ for
$4 \le m \le 8$ and $R(3,3)$ [Bian et al. (2013) PRL 111, 130505]. This was greatly celebrated
as the largest experimental implementation of an adiabatic evolution algorithm to that date.
However, in that computation, more than 66% of the qubits used were auxiliary qubits, so the
sizes of the Ramsey number Hamiltonians used were tremendously smaller than the full
128-qubit capacity of the device used.

These auxiliary qubits were needed because the best quantum annealing devices at
the time (and still now) cannot implement multi-qubit interactions beyond 2-qubit interactions,
and they are also limited in their capacity for 2-qubit interactions. We present a method which
allows the full qubit capacity of a quantum annealing device to be used, by reducing multi-qubit
and 2-qubit interactions. With our method, the device used in the 2013 Ramsey number quantum
computation could have determined $R(16,2)$ and $R(4,3)$ with under 10 minutes of runtime.


I. INTRODUCTION 

The capacities and limits for adiabatic quantum computers (AQCs) to outperform classical
computers, and to speed up the solution of discrete optimization problems, have recently
been discussed in [1]. As discussed in [1], the quantum annealing devices with the largest
qubit capacities tend to allow at most 2-qubit interactions, and are even limited in the
2-qubit interactions allowed. Similarly, even when solving a discrete optimization problem
on a classical computer, high-order terms rapidly make the problem more difficult. If only
linear terms (1-qubit terms) are present in the Hamiltonian (objective function), then finding
the solution to the problem is trivial, but if quadratic terms (2-qubit terms) are allowed the
problem becomes NP-complete.

Nevertheless, an enormous body of work has been done on efficient algorithms for quadratic
unconstrained Boolean optimization (QUBO) problems, and it is known that if all coefficients
of quadratic terms are negative, the solution can be found in polynomial time. When cubic
(3-qubit) terms and beyond are present, another leap in difficulty arises, and most of the
effort is typically spent on quadratizing such objective functions (Hamiltonians). Most
quadratization techniques work by adding auxiliary variables (qubits), and while algorithms
for finding solutions to discrete optimization problems often scale exponentially with the
number of variables, it is still often desirable to remove cubic terms and higher at the
expense of adding more variables.

* eto25@cam.ac.uk

† richard.tanburn@hertford.ox.ac.uk

‡ nike.dattani@gmail.com

However, quantum annealing devices to date are very limited in qubit capacity (the largest
device reported to date having only about 2 kiloqubits, or 256 qubytes). Therefore, adding
auxiliary qubits is usually not an option if any benefit over traditional computation methods
is desired for any relevant problem. In Part 1 [1] we demonstrated a method called
"deduc-reduc" which reduces multi-qubit interactions without adding auxiliary qubits, and
for the integer factorization problem, managed to eliminate thousands of 4-qubit and 3-qubit
interactions with just a few seconds of CPU time. A drawback of this method is that some
deduction must be made which relates the variables of the discrete optimization problem (an
example of such a deduction could be $x_1 + x_2 = 1$). Such deductions arise naturally
for the problem of integer factorization, but there is no reason to believe that such
deductions can be made for an arbitrary discrete optimization problem.

In this paper we present a method for reducing multi-qubit interactions without adding
auxiliary qubits and without the need for any deductions. The price is an increase in the
number of objective functions that need to be minimized to find the solution to the original
objective function, and this number can be reduced if some auxiliary qubits are allowed. We
call this method "split-reduc" since it iteratively splits the Hamiltonian into separate
Hamiltonians in order to reduce multi-qubit terms. We give very conservative lower and upper
bounds on the number of new objective functions created, and we showcase split-reduc on
the Hamiltonian used in the determination of Ramsey numbers using quantum annealing, as in
Bian et al.

II. A QUICK EXAMPLE 

Let us demonstrate the method with the simple objective function
$H = 1 + x_1x_2x_5 + x_1x_6x_7x_8 + x_3x_4x_8 - x_1x_3x_4$
and an adiabatic quantum computer (AQC) that has only 8 qubits and allows at most 2-qubit
interactions. Due to the restriction on the number of qubits we cannot reduce the cubic and
quartic terms to quadratic terms by introducing auxiliary variables. The simple but effective
idea is then to "split" the objective function into two by setting a variable to each of its
two possible values (0 or 1). In this case $x_1$ is the obvious choice to split over since it
is present in the most terms and contributes to the quartic term. Setting $x_1$ to 0 results
in the objective function $H_0 = 1 + x_3x_4x_8$ and setting $x_1$ to 1 results in
$H_1 = 1 + x_2x_5 + x_6x_7x_8 + x_3x_4x_8 - x_3x_4$.

$H_0$ still contains a cubic term, so we have the choice to split $H_0$ further. However, at
this point we have 5 unused qubits, so we could also stop here by using one of them as an
auxiliary variable to quadratize the objective function. $H_1$, on the other hand, is still a
bit too complicated for our quantum computer to handle. It contains cubic terms and requires
7 qubits out of the 8-qubit capacity of the AQC, so we split again, this time over $x_8$.
Setting $x_8$ to 0 and to 1 gives the objective functions $H_{10} = 1 + x_2x_5 - x_3x_4$ and
$H_{11} = 1 + x_2x_5 + x_6x_7$ respectively (in $H_{11}$ the two $x_3x_4$ terms cancel). Both
contain only quadratic terms, so we have succeeded in turning our Hamiltonian into 3 separate
Hamiltonians that can each be implemented on the AQC. In general, we can reduce the number of
splits necessary by combining this approach with established quadratization techniques that
introduce auxiliary variables.
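The splitting step of this section can be reproduced mechanically. The following Python sketch is illustrative only and is not taken from the original work: it assumes a Hamiltonian is stored as a dictionary mapping each monomial (a frozenset of variable indices) to its integer coefficient, and the helper names `fix_variable` and `degree` are hypothetical.

```python
def fix_variable(H, var, value):
    """Return H with the binary variable `var` fixed to `value` (0 or 1)."""
    out = {}
    for term, coeff in H.items():
        if var in term:
            if value == 0:
                continue               # the whole monomial vanishes
            term = term - {var}        # x_var = 1 drops out of the monomial
        out[term] = out.get(term, 0) + coeff
    # discard monomials whose coefficients cancelled
    return {t: c for t, c in out.items() if c != 0}

def degree(H):
    """Highest order of any monomial in H."""
    return max(len(term) for term in H)

# H = 1 + x1 x2 x5 + x1 x6 x7 x8 + x3 x4 x8 - x1 x3 x4
H = {
    frozenset(): 1,
    frozenset({1, 2, 5}): 1,
    frozenset({1, 6, 7, 8}): 1,
    frozenset({3, 4, 8}): 1,
    frozenset({1, 3, 4}): -1,
}

H0 = fix_variable(H, 1, 0)     # split over x1
H1 = fix_variable(H, 1, 1)
H10 = fix_variable(H1, 8, 0)   # split H1 over x8
H11 = fix_variable(H1, 8, 1)

for name, ham in [("H0", H0), ("H10", H10), ("H11", H11)]:
    print(name, "degree", degree(ham), "terms", len(ham))
```

The three leaves `H0`, `H10` and `H11` are the Hamiltonians that would be handed to the device; `H10` and `H11` are quadratic, while `H0` keeps one cubic term that a single auxiliary qubit could absorb.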

III. THE METHOD 

We now demonstrate the method in full generality.
We first define two cost functions:

1. $C(H)$ tells us whether or not we need to split the
Hamiltonian any further, and

2. $C_H(x_i)$ tells us which variable to choose for the
splitting at each step, by assigning a cost to each
variable.

The idea is that we keep splitting the Hamiltonian,
according to the variable selected from $C_H(x_i)$, until
$C(H)$ is true.

A. Choosing $C(H)$

Different problems may involve different constraints. If a device can only handle 2-qubit
interactions (such as SQUID-based quantum annealers as in [?]) we might want a different
$C(H)$ than if the device can handle 3-qubit interactions (such as NMR-based AQCs as in [?]).
If we cannot, or do not want to, add any auxiliary variables, then we do not need the
function $C(H)$.

If we wish to allow the addition of auxiliary variables, then for each term $t$ in $H$, we
determine how many auxiliary variables $n_{\mathrm{aux},t}$ will be needed in order to reduce
$t$ to our desired order (quadratic order for a device that allows 2-qubit interactions, cubic
order for a device that allows 3-qubit interactions, etc.). The function is then

$$C(H) = n + \sum_t n_{\mathrm{aux},t} \le Q, \qquad (1)$$

where $n$ is the original number of qubits before any auxiliary qubits were added and $Q$ is
our AQC's qubit capacity.

Since the most successful quantum annealing experiments performed thus far have been on
architectures which do not allow higher than 2-qubit interactions, we will give an example of
how to choose $C(H)$ for such a device. There are many different ways to quadratize a term
$t$, and each of these methods will have its own $n_{\mathrm{aux},t}$, but we know from [?]
that $n_{\mathrm{aux},t}$ need not exceed the bound

$$n_{\mathrm{aux},t} \le \mathcal{R}(\mathrm{order}(t) - 2), \qquad (2)$$

where $\mathcal{R}$ is the ramp function, $\mathcal{R}(x) = \max(x, 0)$ (see the Appendix for
details about the quadratization method which needs at most this many auxiliary variables).
For terms that are already quadratic, linear, or constant, $\mathrm{order}(t) \le 2$, so
$\mathcal{R}(\mathrm{order}(t) - 2) = 0$ and no auxiliary variables are necessary. If $t$ is,
for example, quintic, then $\mathcal{R}(\mathrm{order}(t) - 2) = \mathcal{R}(5 - 2) = 3$, so
the maximum number of auxiliary qubits that $t$ adds to the cost function in Eq. (1) is 3.

Therefore, if our goal is to quadratize the Hamiltonian for a device that only allows up to
2-qubit interactions, and we are limited to only $Q$ total qubits, then the cost relation is

$$C(H) = n + \sum_t \mathcal{R}(\mathrm{order}(t) - 2) \le Q. \qquad (3)$$
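As a concrete reading of the cost relation in Eq. (3), the sketch below evaluates it for a polynomial stored as a dictionary from monomials (frozensets of variable indices) to coefficients. This representation, the function names, and the choice to take $n$ as the number of distinct variables appearing in the Hamiltonian being tested are assumptions made for illustration, not part of the original text.

```python
def ramp(x):
    """Ramp function R(x) = max(x, 0)."""
    return max(x, 0)

def aux_needed(H):
    """Upper bound on auxiliary qubits needed to quadratize H: sum of R(order(t) - 2)."""
    return sum(ramp(len(term) - 2) for term in H)

def fits_device(H, Q):
    """Cost relation C(H): True when variables plus auxiliaries fit in Q qubits."""
    n = len(set().union(*H))   # assumption: n = distinct variables present in H
    return n + aux_needed(H) <= Q

# Objective function of Section II and its x1 = 0 branch
H = {
    frozenset(): 1,
    frozenset({1, 2, 5}): 1,
    frozenset({1, 6, 7, 8}): 1,
    frozenset({3, 4, 8}): 1,
    frozenset({1, 3, 4}): -1,
}
H0 = {frozenset(): 1, frozenset({3, 4, 8}): 1}

print(aux_needed(H), fits_device(H, Q=8))     # needs splitting on an 8-qubit device
print(aux_needed(H0), fits_device(H0, Q=8))   # one auxiliary qubit suffices
```

With $Q = 8$ the full objective function fails the test (8 variables plus 5 auxiliaries), whereas the $x_1 = 0$ branch passes with 3 variables and 1 auxiliary.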

B. Choosing $C_H(x_i)$

As in the previous section, our choice of $C_H(x_i)$ depends on the situation. We may wish to
only have quadratic terms without introducing any auxiliary variables, or we may want to
choose a cost function that picks the variable that appears most frequently in the undesired
(super-quadratic) terms. If we choose Eq. (3) to be our cost formula, we may wish to choose a
greedy $C_H(x_i)$ that simply minimizes the number of auxiliary variables that would need to
be added in order to quadratize the Hamiltonian. In conjunction with the cost formula in
Eq. (3) we may define:

$$C_H(x_i) = \sum_t \left[ I_{x_i,t}\, \mathcal{R}(\mathrm{order}(t) - 2 + 1) \right], \qquad (4)$$

where $I_{x_i,t}$ is 1 if $x_i$ appears in $t$ and 0 otherwise. The indicator function makes
sure we only count terms in which $x_i$ appears, and $\mathcal{R}(\mathrm{order}(t) - 2)$ is
of course the maximum number of auxiliary variables needed to quadratize term $t$, but we
include $+1$ to account for when the variable is set to 1. For the splitting, we then
choose the variable $x_i$ with the biggest $C_H(x_i)$.
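Putting the two cost functions together gives a short recursive procedure. The sketch below is a minimal illustration under stated assumptions rather than the authors' implementation: polynomials are dictionaries from frozensets of variable indices to coefficients, $n$ in Eq. (3) is taken as the number of variables still present, ties in Eq. (4) are broken by the smallest index, and all function names are hypothetical.

```python
def ramp(x):
    """Ramp function R(x) = max(x, 0)."""
    return max(x, 0)

def fix(H, var, value):
    """Substitute binary variable `var` = `value` into H."""
    out = {}
    for term, coeff in H.items():
        if var in term:
            if value == 0:
                continue               # monomial vanishes
            term = term - {var}        # x_var = 1 drops out
        out[term] = out.get(term, 0) + coeff
    return {t: c for t, c in out.items() if c != 0}

def fits(H, Q):
    """Eq. (3), with n taken as the number of variables still present in H."""
    n_vars = len(set().union(*H))
    return n_vars + sum(ramp(len(t) - 2) for t in H) <= Q

def variable_costs(H):
    """Eq. (4): for each variable, sum R(order(t) - 2 + 1) over terms containing it."""
    costs = {}
    for term in H:
        for v in term:
            costs[v] = costs.get(v, 0) + ramp(len(term) - 2 + 1)
    return costs

def split_reduc(H, Q):
    """Return the leaf Hamiltonians of the splitting tree for capacity Q."""
    if fits(H, Q):
        return [H]
    costs = variable_costs(H)
    var = max(sorted(costs), key=costs.get)   # tie-break by smallest index (assumption)
    return split_reduc(fix(H, var, 0), Q) + split_reduc(fix(H, var, 1), Q)

# Objective function of Section II
H = {
    frozenset(): 1,
    frozenset({1, 2, 5}): 1,
    frozenset({1, 6, 7, 8}): 1,
    frozenset({3, 4, 8}): 1,
    frozenset({1, 3, 4}): -1,
}
costs = variable_costs(H)
leaves = split_reduc(H, Q=8)
print(costs)
print(len(leaves), "Hamiltonians")
```

On the example of Section II, $x_1$ receives the highest cost and is split first, and the procedure ends with three Hamiltonians for $Q = 8$.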

IV. ESTIMATES ON THE NUMBER OF SPLITTINGS

In Section II the benefits of splitting were clear. We only needed 3 objective functions in
the end, which is a small fraction of the search space of size $2^8$. But what about in
general? It is not difficult to construct cases in which the number of splits required blows
up. However, this is often not the case. We may think of the splitting process as giving rise
to a binary tree. The root of the tree is the original objective function and each node has
two branches or zero branches (from splitting or not splitting respectively). Establishing
tight analytic bounds on the number of leaves may seem tricky, but with simple assumptions,
we show that we can estimate upper and lower bounds remarkably well.

A. Heuristic bounds 

Let us start with the most basic lower bound we can imagine. We can assume that the shortest
path from the root of the tree to a Hamiltonian that satisfies all hardware requirements is
found by successively choosing the variable with the highest cost and setting it to 0. While
false in cases like $H = (1-x_1)(1-x_2)(1-x_3)$, where setting any variable to 1 is preferable
to setting it to 0, it is usually true when a lot of the monomials have the same sign or many
terms do not share variables. Likewise, we can assume that the longest path is found by
setting the highest cost variable to 1 at each split.

Provided the above conditions hold, finding the lengths of the extreme paths then becomes
trivial and requires at most $n$ substitutions. Once we know these lengths, call them $l$ and
$s$ for the longest and shortest path respectively, we know a lower bound is $2^s$ and an
upper bound is $2^l$.
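The two extreme path lengths can be found by greedy substitution, as the following sketch illustrates. It is a hedged example, not the authors' code: it assumes the dictionary-of-monomials representation, uses the cost relation of Eq. (3) with $n$ taken as the number of variables still present, uses Eq. (4) to rank variables, and the function names are hypothetical.

```python
def ramp(x):
    return max(x, 0)

def fix(H, var, value):
    """Substitute binary variable `var` = `value` into H."""
    out = {}
    for term, coeff in H.items():
        if var in term:
            if value == 0:
                continue
            term = term - {var}
        out[term] = out.get(term, 0) + coeff
    return {t: c for t, c in out.items() if c != 0}

def fits(H, Q):
    """Eq. (3) with n = number of variables present in H (assumption)."""
    return len(set().union(*H)) + sum(ramp(len(t) - 2) for t in H) <= Q

def top_variable(H):
    """Variable with the largest cost under Eq. (4); ties go to the smallest index."""
    costs = {}
    for term in H:
        for v in term:
            costs[v] = costs.get(v, 0) + ramp(len(term) - 1)
    return max(sorted(costs), key=costs.get)

def path_length(H, Q, value):
    """Number of greedy substitutions x = value until H fits the device."""
    steps = 0
    while not fits(H, Q):
        H = fix(H, top_variable(H), value)
        steps += 1
    return steps

H = {
    frozenset(): 1,
    frozenset({1, 2, 5}): 1,
    frozenset({1, 6, 7, 8}): 1,
    frozenset({3, 4, 8}): 1,
    frozenset({1, 3, 4}): -1,
}
s = path_length(H, Q=8, value=0)   # always set the top variable to 0
l = path_length(H, Q=8, value=1)   # always set the top variable to 1
lower, upper = 2 ** s, 2 ** l
print(s, l, lower, upper)
```

For the example of Section II this brackets the true count of 3 objective functions between the heuristic lower and upper bounds.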

B. More sophisticated estimates based on combinatorics 

The above bounds are not very tight, so we formulate a more sophisticated estimate, and we
ensure that the method tends to overestimate the number of splits. Let us make the stronger
assumption that if $s$ variables were set to 0 to obtain the shortest path, then setting $s$
variables to 0 will always be sufficient to obtain a Hamiltonian that satisfies the hardware
requirements (a "desirable Hamiltonian"). The reason this tends to overestimate (and hence
could be considered an upper bound) is that it ignores the fact that setting a variable to 1
also helps simplify the Hamiltonian. Using the same number of operations as before, we can
now find better bounds. We know that either $s$ variables are set to 0 to obtain a desirable
Hamiltonian, or $l$ variables have been set to 0 or 1 (since $l$ is the length of the longest
path). To count the number of such paths consider an $l$-bit string

$$x_1 x_2 \ldots x_l. \qquad (5)$$

There are $\binom{l}{s}$ ways to choose at which stage the $s$ variables are set to 0. Filling
in all the blank spaces before the last 0 with 1's and leaving the rest empty characterizes
all desirable Hamiltonians in which $s$ variables were set to 0. If $k$ variables are set to 0
where $k < s$, then there are $\binom{l}{k}$ desirable Hamiltonians since all we need is that
$l$ variables have been given a value. Thus, the number of Hamiltonians is

$$\sum_{k=0}^{s} \binom{l}{k}. \qquad (6)$$


What happens if we do not ignore the reducing potential of setting a variable to 1? Suppose
$R_i$ right moves (setting variables to 1) reduce a Hamiltonian as much as $i$ left moves
(setting variables to 0). We want to count the number of paths where $k$ left moves are made.
When $k < s$, left moves alone will not simplify our Hamiltonian sufficiently. We will need
$R_{s-k}$ right moves to make up the difference. However, we can include a full
$R_{s-k+1} - 1$ right moves, since without that last right move the Hamiltonian will not be
desirable. That means we have $k + R_{s-k+1} - 1$ slots to fill with $k$ left moves and
$R_{s-k+1} - 1$ right moves. The number of such paths is simply
$\binom{R_{s-k+1} - 1 + k}{k}$, and so the total number of desirable Hamiltonians is

$$\sum_{k=0}^{s} \binom{R_{s-k+1} - 1 + k}{k}. \qquad (7)$$
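The two combinatorial estimates are straightforward to evaluate numerically. The sketch below assumes that both sums run over $k = 0, \ldots, s$ (a reading that reproduces the worked numbers $1 + \binom{2}{1} = 3$ and $1 + \binom{26}{1} = 27$ quoted elsewhere in this paper) and that the $R_i$ are supplied as a dictionary; the function names are hypothetical.

```python
from math import comb

def estimate_left_moves_only(s, l):
    """Estimate ignoring right moves: sum over k = 0..s of C(l, k)."""
    return sum(comb(l, k) for k in range(s + 1))

def estimate_with_right_moves(s, R):
    """Estimate including right moves: sum over k = 0..s of C(R[s-k+1] - 1 + k, k).

    R maps i (1..s) to the number of right moves equivalent to i left moves.
    The k = 0 term is C(., 0) = 1 whatever R[s+1] is, so R[s+1] is not required.
    """
    total = 1                                   # k = 0: the all-right-moves path
    for k in range(1, s + 1):
        total += comb(R[s - k + 1] - 1 + k, k)
    return total

# Example of Section II: s = 1, l = 2, R_1 = 2
print(estimate_left_moves_only(1, 2))           # 1 + C(2, 1)
print(estimate_with_right_moves(1, {1: 2}))     # 1 + C(2, 1)
# R(8,2): s = 1, R_1 = 26
print(estimate_with_right_moves(1, {1: 26}))    # 1 + C(26, 1)
```

Because every right move is counted as if it were needed, the second estimate never falls below the number of paths that consist of right moves followed by the required left moves, which is why it behaves as an overestimate in practice.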

The estimate in Eq. (7) is likely also a slight overestimate, since in practice we often
encounter Hamiltonians that don't require exactly $R_i$ right moves, but rather some number in
the neighborhood of $R_i$. Now only one issue remains: calculating $R_i$. The authors of this
paper prefer overestimates to bad estimates, so we shall try to find $R_i$ that are likely
larger than they need to be. We start by successively making right moves until a desirable
Hamiltonian is reached. Before each right move, however, we note how many left moves would be
needed to reach a desirable Hamiltonian from this point, and thus generate a sequence of
length $l + 1$ (the number of nodes on the path). If the sequence is non-increasing, this
method is likely to produce a good estimate since it conforms to the assumptions we made. We
then define $R_i$ to be the position of the last occurrence of $s - i + 1$ in the sequence,
since that is the point at which adding a right move would remove the need for a left move. If
the sequence is not non-increasing then this will just produce a higher upper bound, and if
the sequence skips a number (by, for example, decreasing by two), we define $R_i$ instead to
be the last occurrence of a number larger than $s - i + 1$. This procedure involves $O(n^2)$
steps since the maximum length of any path is $n$.

Table I. Performance of split-reduc on $R(4,3)$.

Number of    Total size of    # of Hamiltonians      Upper bound from
vertices     search space     needed with Q = 128    Section IV B
6            2^15             1                      1
7            2^21             9                      9
8            2^28             169                    187
9            2^36             6 716                  9 097

Number of    Total size of    # of Hamiltonians      Upper bound from
vertices     search space     needed with Q = 50     Section IV B
6            2^15             9                      9
7            2^21             126                    156
8            2^28             3 367                  3 893
9            2^36             177 754                346 758

Number of    Total size of    # of Hamiltonians      Upper bound from
vertices     search space     needed with Q = 30     Section IV B
6            2^15             24                     27
7            2^21             398                    573
8            2^28             13 389                 22 246
9            2^36             829 055                1 932 743

To demonstrate the idea, we consider the objective function from Section II. The shortest
path is $s = 1$ and the longest is $l = 2$ (see Fig. 1). The sequence generated by the above
procedure is $(1, 1, 0)$, so $R_1 = 2$. That means the number of objective functions is
$1 + \binom{2}{1} = 3$, which happens to be correct!
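The rule for reading $R_i$ off the sequence of "left moves still needed" can be written in a few lines. In this sketch, which is an interpretation rather than the authors' code, $R_i$ is taken to be the last 1-indexed position whose entry is at least $s - i + 1$; for a non-increasing sequence this coincides with the last occurrence of $s - i + 1$, and it also covers the case where the sequence skips that value. The function name is hypothetical.

```python
def right_move_equivalents(sequence, s):
    """Return {i: R_i} for i = 1..s from the sequence of left moves still needed.

    sequence[0] is the value at the root (equal to s) and each later entry is
    the value after one more right move; positions are counted from 1.
    """
    R = {}
    for i in range(1, s + 1):
        target = s - i + 1
        positions = [p for p, v in enumerate(sequence, start=1) if v >= target]
        R[i] = positions[-1]
    return R

# Example of Section II: sequence (1, 1, 0) with s = 1
print(right_move_equivalents((1, 1, 0), s=1))

# A sequence that skips a value: 2 -> 0, so R_1 and R_2 coincide
print(right_move_equivalents((2, 2, 0), s=2))
```

The resulting dictionary can be passed directly to an evaluation of the sum in Eq. (7).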


PERFORMANCE ON RAMSEY NUMBER HAMILTONIANS

It has been shown in [?, ?] that finding the Ramsey number $R(m,n)$ is equivalent to finding
what number of vertices is needed for the ground state of a certain Hamiltonian to have an
energy greater than 0. For each $(m,n)$ a Hamiltonian $H(m,n,N)$ is associated with a graph
$G$ with $N$ vertices, and counts the number of complete subgraphs $K_m$ and $n$-independent
sets. The first number $N$ such that the global minimum of $H(m,n,N)$ is not 0 is defined as
the Ramsey number $R(m,n)$.

              1 + x1x2x5 + x1x6x7x8 + x3x4x8 - x1x3x4
                /                              \
           x1 = 0                             x1 = 1
              /                                  \
      1 + x3x4x8               1 + x2x5 + x6x7x8 + x3x4x8 - x3x4
                                  /                    \
                             x8 = 0                   x8 = 1
                                /                        \
                     1 + x2x5 - x3x4             1 + x2x5 + x6x7

Figure 1. A tree representation of the splitting of $H$.


C. $R(m,2)$

The largest Ramsey number determined by the quantum annealing device in Bian et al. was
$R(8,2) = 8$. The Hamiltonian for this case is:

$$H = \sum_{k=1}^{\binom{m}{2}} (1 - a_k) + \prod_{k=1}^{\binom{m}{2}} a_k, \qquad \binom{m}{2} = \binom{8}{2} = 28, \qquad (8)$$

and it is clear we need to deal with a 28-qubit interaction, because the second term is a
product of 28 qubits. Bian et al. introduce auxiliary variables. We will use split-reduc
instead.

Due to the complete symmetry of all the variables we can pick one at random at each step and
split over it. If we choose not to allow auxiliary variables, and aim to split $H$ until it is
quadratic, we end up with the following Hamiltonians after splitting:

$$1 + \sum_{k=1+i}^{28} (1 - a_k) \ \text{ for } 1 \le i \le 26, \quad \text{and} \quad 2 - a_{27} - a_{28} + a_{27}a_{28}. \qquad (9)$$

If we had used Eq. (7) to predict the number of objective functions we would have found that
$s = 1$ and $l = R_1 = 26$. That would mean the number of Hamiltonians is
$1 + \binom{26}{1} = 1 + 26 = 27$, which also happens to be correct!
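The count of 27 can be checked by carrying out the splitting explicitly. The sketch below is an illustration under assumptions: it builds the $R(m,2)$ Hamiltonian as a sum of $(1 - a_k)$ terms plus the product of all $a_k$, stores it as a dictionary from frozensets of variable indices to coefficients, always splits over the smallest-index variable of a super-quadratic term (permissible by symmetry), and uses hypothetical function names.

```python
from math import comb

def fix(H, var, value):
    """Substitute binary variable `var` = `value` into H."""
    out = {}
    for term, coeff in H.items():
        if var in term:
            if value == 0:
                continue
            term = term - {var}
        out[term] = out.get(term, 0) + coeff
    return {t: c for t, c in out.items() if c != 0}

def ramsey_m2_hamiltonian(m):
    """Sum over k of (1 - a_k) plus the product of all a_k, with C(m,2) variables."""
    n_edges = comb(m, 2)
    H = {frozenset(): n_edges}
    for k in range(1, n_edges + 1):
        H[frozenset({k})] = H.get(frozenset({k}), 0) - 1
    everything = frozenset(range(1, n_edges + 1))
    H[everything] = H.get(everything, 0) + 1
    return H

def split_until_quadratic(H):
    """Split (no auxiliary variables) until every leaf is at most quadratic."""
    high_order = [t for t in H if len(t) > 2]
    if not high_order:
        return [H]
    var = min(high_order[0])   # any variable works, by symmetry
    return (split_until_quadratic(fix(H, var, 0))
            + split_until_quadratic(fix(H, var, 1)))

leaves = split_until_quadratic(ramsey_m2_hamiltonian(8))
print(len(leaves))
print(leaves[-1])   # the single quadratic leaf, reached by setting 26 variables to 1
```

All leaves except the last are linear, which mirrors the structure of Eq. (9).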

For $R(m,2)$ in general the combinatorial estimates in Section IV B are provably correct, and
$s = 1$, $l = R_1 = \binom{m}{2} - 2$. Thus with the 128 qubits available in the quantum
annealing device of Bian et al., the authors could also have calculated $R(16,2)$ with at most
$\binom{16}{2} - 2 = 118$ runs. Section F of the Supplementary Information of Bian et al.
explains that the annealing runtimes tend to be around 2.5 ms, which includes a 1.5 ms delay
for reading out the answer from the machine. Therefore, 118 runs on the device are feasible
within 1 second. It is clear that the quantum annealing device of Bian et al. is not so much
runtime-limited for this problem as it is limited by the qubit capacity. We also note that
118 runs on the quantum annealing device is about 34 orders of magnitude smaller than the
size of the total search space ($2^{120}$) if a brute force search were to be attempted to
find $R(16,2)$.
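The figures in this paragraph follow from elementary arithmetic, which the short check below reproduces. It uses only quantities stated in the text (the run bound $\binom{m}{2} - 2$, 2.5 ms per run, and one binary variable per edge of the 16-vertex graph); the variable and function names are illustrative.

```python
from math import comb, log10

def runs_needed(m):
    """Upper bound on device runs for R(m,2) quoted in the text: C(m,2) - 2."""
    return comb(m, 2) - 2

runs = runs_needed(16)
runtime_s = runs * 2.5e-3                 # 2.5 ms per annealing run
search_space = 2 ** comb(16, 2)           # one binary variable per edge
orders = log10(search_space) - log10(runs)

print(runs, "runs")
print(f"{runtime_s:.3f} s total")
print(f"{orders:.1f} orders of magnitude fewer evaluations than brute force")
```

The same function gives the run bound for the larger device discussed next, since `runs_needed(64)` depends only on $\binom{64}{2}$.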

Furthermore, while $R(16,2)$ was the largest $R(m,2)$ Ramsey number that could have been
determined by the 128-qubit device used in Bian et al. had they used split-reduc, we note that
the newest version of that device has a qubit capacity of $Q = 2048$, meaning that we could
now determine $R(64,2)$, which requires $\binom{64}{2} = 2016$ qubits and would only require
at most $\binom{64}{2} - 2 = 2014$ runs on the device.


D. $R(m,3)$

The $R(m,2)$ numbers have very simple objective functions. $R(m,3)$ Ramsey numbers are much
more complicated, and the only one that was found in Bian et al. was $R(3,3)$. The Hamiltonian
for $R(4,3)$ for each $N$ is too lengthy to present here, but can be derived from [?] and has
at most 6-qubit interactions. Therefore, we apply split-reduc with Eq. (3) as our choice of
$C(H)$ and Eq. (4) as our choice of $C_H$. Table I shows how close our overestimates from
Section IV B are for $R(4,3)$, where we know that the required number of vertices (and hence
the Ramsey number itself) is 9. While minimizing 6716 Hamiltonians would take only about 17
seconds on the quantum annealing device of Bian et al. (at 2.5 ms per run), we note that this
device has another restriction, which was not relevant for $R(m,2)$ because every term after
split-reduction was linear at most (except for the last one). While the split-reduced $R(4,3)$
Hamiltonians meet the requirement that they are all quadratic at most, the quantum device of
Bian et al. also requires that the quadratic couplings can be implemented on their "chimera"
graph.

In their example, $R(8,2)$ with $N = 8$ could be determined with a Hamiltonian that after
quadratization had 54 qubits, and required 30 more qubits to chimerize the connectivity of the
54-vertex graph describing the connections between all qubits in the Hamiltonian. Therefore,
for $R(4,3)$, if we choose the case in Table I that uses at most $Q = 50$ qubits in the
split-reduced Hamiltonians, it is reasonable to assume that each of the resulting 177 754
Hamiltonians could be chimerized using the 78 qubits remaining in the 128-qubit device. If
each minimization again took 2.5 ms, the 177 754 runs would take about 7.4 minutes, so
$R(4,3)$ would be determined within 10 minutes.


V. CONCLUSION 

This is the second paper of a 2-part series on techniques for reducing multi-variable terms
in discrete optimization problems. The first method is called "deduc-reduc" because it uses
deductions to reduce the multi-qubit (multi-variable) terms in the Hamiltonian (objective
function), and is presented in [1] with an application to the quantum factorization of numbers
larger than 56153, which is currently the largest number factored on a quantum device [?].
Deduc-reduc can also be used to reduce multi-qubit interactions in the Ramsey number
Hamiltonians discussed in the present paper, but we wished to focus only on the split-reduc
method in this paper. Combining deduc-reduc, split-reduc, and a third algorithm we have
recently devised for reducing the size of the search space for the Ramsey number discrete
optimization problem, we are able to establish estimated runtimes for some of the presently
undetermined Ramsey numbers such as $R(6,4)$ and $R(10,3)$ [?].


APPENDIX: QUADRATIZATION METHOD NEEDING AT MOST $\mathcal{R}(\mathrm{order}(t) - 2)$ AUXILIARY QUBITS

One way to quadratize a high-order term is to use the penalty function presented in Section II
of the Supplementary Information of [?]:

$$P(a_1, a_2; b) = a_1 a_2 - 2(a_1 + a_2) b + 3b, \qquad (10)$$

which attains its minimum of 0 if and only if $b = a_1 a_2$. Therefore if our Hamiltonian has
a high-order term such as:

$$a_1 a_2 a_3 \ldots a_n, \qquad (11)$$

we can reduce its order by one, by replacing $a_1 a_2$ with a new variable $b$:

$$a_1 a_2 a_3 \ldots a_n \rightarrow b\, a_3 \ldots a_n + \lambda P(a_1, a_2; b), \qquad (12)$$

for a scalar $\lambda$ that is sufficiently large to not introduce any spurious minima (this
is the "deduc-reduc" method of Part 1 of this paper [1], with $b = a_1 a_2$ as the deduction,
and the choice of $\lambda$ is discussed there). By construction, whether the LHS or RHS of
Eq. (12) is considered, the unique minimum/minima will be the same, but the LHS has order $n$
and the RHS does not have any terms greater than order $n - 1$.

Our reduced term

$$b\, a_3 \ldots a_n \qquad (13)$$

can then be further reduced by choosing another 2 variables to transform. Repeatedly applying
this method allows us to quadratize a term $t$ with at most $\mathrm{order}(t) - 2$
applications, which explains Eq. (2) in the main text.
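Both claims of this appendix can be checked by brute force. The sketch below first verifies over all eight binary assignments that the penalty function is non-negative and vanishes exactly when $b = a_1 a_2$, and then counts the substitutions needed to bring a monomial down to quadratic order. The function names and the convention of numbering auxiliary variables from `first_aux` are illustrative assumptions.

```python
from itertools import product

def penalty(a1, a2, b):
    """P(a1, a2; b) = a1*a2 - 2*(a1 + a2)*b + 3*b."""
    return a1 * a2 - 2 * (a1 + a2) * b + 3 * b

# The penalty is zero exactly on assignments with b = a1*a2, and positive otherwise.
for a1, a2, b in product((0, 1), repeat=3):
    assert penalty(a1, a2, b) >= 0
    assert (penalty(a1, a2, b) == 0) == (b == a1 * a2)

def quadratize_monomial(variables, first_aux):
    """Repeatedly replace the first two variables by a fresh auxiliary variable.

    Returns the remaining (at most quadratic) monomial and the list of
    substitutions (a1, a2, b), each of which contributes one penalty term.
    """
    term = list(variables)
    substitutions = []
    aux = first_aux
    while len(term) > 2:
        a1, a2 = term[0], term[1]
        substitutions.append((a1, a2, aux))
        term = [aux] + term[2:]
        aux += 1
    return term, substitutions

term, substitutions = quadratize_monomial([1, 2, 3, 4, 5], first_aux=100)
print(term)            # remaining quadratic monomial
print(substitutions)   # one entry per auxiliary qubit introduced
```

A quintic monomial therefore needs three substitutions, in agreement with the ramp-function count of auxiliary qubits used in the main text.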